Cutting test runtime for src/test/modules/snapshot_too_old

  • Jump to comment-1
    tgl@sss.pgh.pa.us2022-08-02T15:38:22+00:00
    I've complained before that the snapshot_too_old TAP tests seem ridiculously slow --- close to a minute of runtime even on very fast machines. Today I happened to look closer and realized that there's an absolutely trivial way to cut that. The core of the slow runtime is that there's a "pg_sleep(6)" in the test case; which perhaps could be trimmed, but I'm not on about that right now. What I'm on about is that two of the three isolation tests allow the isolationtester to default to running every possible permutation of steps, one of which doesn't even generate the "snapshot too old" failure. IMV it's sufficient to run just one permutation. That opinion was shared by whoever wrote sto_using_hash_index.spec, but they didn't propagate the idea into the other two tests. The attached cuts the test runtime (exclusive of setup) from approximately 30+24+6 seconds to 6+6+6 seconds, and I don't see that it loses us one iota of coverage. I cleaned up some unused tables and bad comment grammar, too. regards, tom lane
    • Jump to comment-1
      robertmhaas@gmail.com2022-08-02T17:28:31+00:00
      On Tue, Aug 2, 2022 at 11:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I've complained before that the snapshot_too_old TAP tests seem > ridiculously slow --- close to a minute of runtime even on very fast > machines. Today I happened to look closer and realized that there's > an absolutely trivial way to cut that. The core of the slow runtime > is that there's a "pg_sleep(6)" in the test case; which perhaps could > be trimmed, but I'm not on about that right now. What I'm on about > is that two of the three isolation tests allow the isolationtester to > default to running every possible permutation of steps, one of which > doesn't even generate the "snapshot too old" failure. IMV it's > sufficient to run just one permutation. That opinion was shared by > whoever wrote sto_using_hash_index.spec, but they didn't propagate > the idea into the other two tests. > > The attached cuts the test runtime (exclusive of setup) from > approximately 30+24+6 seconds to 6+6+6 seconds, and I don't see > that it loses us one iota of coverage. > > I cleaned up some unused tables and bad comment grammar, too. Yeah, I feel like it was a mistake to allow the list of permutations to be unspecified. It encourages people to just run them all, which is almost never a thoughtful decision. Maybe there's something to be said for running these tests in one successful permutation and one failing permutation -- or maybe even that is overkill -- but running them all seems like a poor idea. -- Robert Haas EDB: http://www.enterprisedb.com
      • Jump to comment-1
        tgl@sss.pgh.pa.us2022-08-02T17:50:51+00:00
        Robert Haas <robertmhaas@gmail.com> writes: > Yeah, I feel like it was a mistake to allow the list of permutations > to be unspecified. It encourages people to just run them all, which is > almost never a thoughtful decision. Maybe there's something to be said > for running these tests in one successful permutation and one failing > permutation -- or maybe even that is overkill -- but running them all > seems like a poor idea. Yeah, I considered letting the no-error permutation survive. But I didn't really see what coverage it was adding at all, let alone coverage that'd justify doubling the test runtime. Also ... while doing further research I was reminded that a couple years ago we were seriously discussing nuking old_snapshot_threshold altogether, on the grounds that it was so buggy as to be unsafe to use, and nobody was stepping up to fix it [1][2]. It doesn't appear to me that the situation has got any better, so I wonder if we're prepared to pull that trigger yet. regards, tom lane [1] https://www.postgresql.org/message-id/flat/20200401064008.qob7bfnnbu4w5cw4%40alap3.anarazel.de [2] https://www.postgresql.org/message-id/flat/CA%2BTgmoY%3Daqf0zjTD%2B3dUWYkgMiNDegDLFjo%2B6ze%3DWtpik%2B3XqA%40mail.gmail.com